SUMMARY


This report documents the experimental and theoretical
approaches taken in developing the Nonparametric Percentile
(program NPPCTL) computer program, and illustrates the developed
program in addition to the source code listing.
SECTION 1
INTRODUCTION 


Estimating percentiles is a very important statistical tool
for relating an individual to a population. For example, the
percentiles of anthropometric measurements are very important in
designing work stations and clothing items. Since it is often
impossible to design these items to fit all population personnel
without modification, the usual procedure is to design for a range
of values, for example in aircraft crew station design, from the 5th
percentile to the 95th percentile. The most commonly used method
for estimating percentiles is the Gaussian method based on the
assumption that the population is normally distributed. However,
nonnormally distributed parameters do exist, such as age, body
skinfold, strength, endurance, and reaction time.

Edmund Churchill (1981) evaluated different methods of
estimating percentiles. Thirteen methods of computing percentiles
from large samples were examined using 100 random samples of each
of ten variables: age, weight, stature, sitting height, hip breadth,
hand length, subscapular skinfold, chest, buttock, and head
circumferences. The sample values were chosen from the 1967 U.S. Air
Force Flying personnel anthropometric survey. No one method was
clearly superior to all others. All methods analyzed were
unsatisfactory with badly skewed data such as age; however,
nonparametric estimates were not studied there.

To  compute  the  percentiles  of  skewed  data,  a  "Nonparametric 
Method"  using  a  nonparametric  estimate  of  the  probability  density 
function  was  developed.  A  nonparametric  procedure  is  a  statistical 
procedure  which  is  valid  irrespective  of  the  type  of  the  probability 
distribution  function  from  which  the  sample  is  obtained. 

For this study three subsets of the age data from the 1967
Anthropometric Survey of U.S. Air Force Flying personnel are
considered. For the first subset, ten randomly selected samples of
size 200 are drawn without replacement from a population of 2420
observations. Also drawn without replacement, for the second and
third subsets, are ten randomly selected samples of sizes 150 and
100 respectively. The percentile estimates are computed using the
Gaussian method and the nonparametric method. The average computed
percentiles and the individual computed percentiles are compared
to the actual percentiles of the total population from which the
data samples are drawn. The actual percentiles of the total
population are computed using the well-known counting procedure.

We observed that the nonparametric method outperforms the
Gaussian method for skewed data when estimating the 5th, 15th,
25th, 35th, 45th, 50th, 65th, 75th, 85th, and 95th percentiles.

This  report  describes  the  basic  equations  used  in  developing 
the  computer  program  for  the  nonparametric  method  in  addition  to 
the  source  code  listing.  It  also  contains  the  examples  used  to 
illustrate  the  method,  and  explains  the  use  of  the  program. 
SECTION 2


THE  NONPARAMETRIC  ESTIMATE  OF 
THE  PROBABILITY  DENSITY  FUNCTION 


Let $X_1, X_2, \ldots, X_n$ be a random sample of size n. Assume that
the probability density function $f(x)$ of the population from
which the random sample is drawn is unknown. Then the estimator
$f_n(x)$ of the probability density function $f(x)$ may be
represented by the following:

$$f_n(x) = \frac{1}{n} \sum_{i=1}^{n} K_n(x, X_i) \tag{1}$$


where n is the sample size, $X_i$ is the ith observation, and
$K_n(x, X_i)$ is the smoothing function or the kernel. The idea of
the estimator of the probability density function is the following.
The empirical distribution function is a discrete distribution with
mass placed at each of the observations. The formula in (1) smooths
this probability out continuously, smoothing according to the choice
of $K_n(x, X_i)$. Thus the choice of $K_n(x, X_i)$ is very important
and to a large extent determines the properties of $f_n(x)$. The
smoothing function used here is

$$K_n(x, X_i) = \frac{1}{2h} e^{-|x - X_i|/h} \tag{2}$$


where h is a selected function of the sample size (n) such that
$h \to 0$, at an appropriate rate, as $n \to \infty$. Of course the
problem is to choose the function $h = h(n)$ converging to 0 at an
appropriate rate. If $h = c n^{-\alpha}$, $\alpha > 0$, the optimum
choice of $\alpha$ is $\frac{1}{5}$. The optimum value of c is a
function of the probability density function $f(x)$, but since we
are attempting to estimate $f(x)$, it is unlikely that we will know
enough to choose an optimum c. Nonetheless, choosing the constant
$c > 0$ to be the standard deviation of the sample data will be
satisfactory. Thus

$$h = s n^{-1/5} \tag{3}$$


where  s  is  the  standard  deviation  of  the  random  sample.  Thus,  the 
nonparametric  estimator  of  the  probability  density  function  is 


$$f_n(x) = \frac{1}{2nh} \sum_{i=1}^{n} e^{-|x - X_i|/h}, \qquad -\infty < x < \infty \tag{4}$$
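
As an illustration only, equations (3) and (4) can be sketched in Python as follows. The function names `bandwidth` and `density_estimate` and the short list of ages are hypothetical and are not taken from program PRCNTLS. The divide-by-n standard deviation is an assumption, made because the Appendix A listing appears to compute it that way.

```python
import math
import statistics


def bandwidth(sample):
    """Equation (3): h = s * n**(-1/5)."""
    n = len(sample)
    s = statistics.pstdev(sample)  # assumption: divide-by-n standard deviation
    return s * n ** (-0.2)


def density_estimate(x, sample, h):
    """Equation (4): f_n(x) = (1 / (2 n h)) * sum_i exp(-|x - X_i| / h)."""
    n = len(sample)
    return sum(math.exp(-abs(x - obs) / h) for obs in sample) / (2.0 * n * h)


# Hypothetical right-skewed sample (not survey data).
ages = [22, 23, 23, 24, 25, 25, 26, 27, 28, 30, 31, 33, 36, 40, 45]
h = bandwidth(ages)

# Crude Riemann sum over [-50, 150] to check that the estimate
# integrates to approximately one.
step = 0.05
area = sum(density_estimate(-50.0 + k * step, ages, h) * step
           for k in range(4000))
print(round(h, 3), round(area, 3))
```

Because each observation contributes a kernel that integrates to one and the sum is divided by n, the estimate is itself a proper density, which the numerical check above is meant to confirm.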


If the random sample is arranged in order of magnitude, then
the $(100)(\xi)$th percentile is the value $y_\xi$ such that
$100\xi$ percent of the observations are less than $y_\xi$ and
$100(1-\xi)$ percent are greater. That is, $y_\xi$ is the
$(100)(\xi)$th percentile if

$$P[X \le y_\xi] = \xi \tag{5}$$


where $P[X \le y_\xi]$ is the probability distribution function. But

$$P[X \le y_\xi] = \int_{-\infty}^{y_\xi} f_n(x)\,dx \tag{6}$$

Therefore

$$\xi = \int_{-\infty}^{y_\xi} f_n(x)\,dx \tag{7}$$

$$\xi = \int_{-\infty}^{y_\xi} \frac{1}{2nh} \sum_{i=1}^{n} e^{-|x - X_i|/h}\,dx = \frac{1}{2nh} \sum_{i=1}^{n} \int_{-\infty}^{y_\xi} e^{-|x - X_i|/h}\,dx \tag{8}$$


The developed program uses an iterative procedure to find
$y_\xi$, which is the nonparametric estimate of the $(100)(\xi)$th
percentile.
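
A minimal sketch of such an iterative solution is given below. It is not the search used in the Appendix A listing, which starts from the Gaussian estimate and shifts or halves an interval; plain bisection is swapped in here for brevity. The names `smoothed_cdf` and `nonparametric_percentile`, the tolerance, and the sample values are all hypothetical. The closed form inside `smoothed_cdf` is the integral of the kernel in equation (8), evaluated term by term.

```python
import math
import statistics


def smoothed_cdf(y, sample, h):
    """Integral of the equation (4) density from -infinity to y."""
    total = 0.0
    for obs in sample:
        u = (y - obs) / h
        if u >= 0.0:
            total += 1.0 - 0.5 * math.exp(-u)
        else:
            total += 0.5 * math.exp(u)
    return total / len(sample)


def nonparametric_percentile(sample, frac, tol=1e-9):
    """Solve smoothed_cdf(y) = frac for y by bisection, 0 < frac < 1."""
    n = len(sample)
    h = statistics.pstdev(sample) * n ** (-0.2)  # equation (3)
    low, high = min(sample), max(sample)
    # Widen the bracket until it contains the root.
    while smoothed_cdf(low, sample, h) > frac:
        low -= h
    while smoothed_cdf(high, sample, h) < frac:
        high += h
    for _ in range(200):  # far more halvings than are ever needed
        if high - low <= tol:
            break
        mid = 0.5 * (low + high)
        if smoothed_cdf(mid, sample, h) < frac:
            low = mid
        else:
            high = mid
    return 0.5 * (low + high)


# Hypothetical right-skewed sample (not survey data).
ages = [22, 23, 23, 24, 25, 25, 26, 27, 28, 30, 31, 33, 36, 40, 45]
for frac in (0.05, 0.50, 0.95):
    print(frac, round(nonparametric_percentile(ages, frac), 2))
```

Bisection is safe here because the smoothed distribution function is continuous and strictly increasing, so equation (7) has exactly one solution for any $\xi$ between 0 and 1.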

The program computes the percentiles of the sample data using
both the Gaussian method and the nonparametric method. For the
Gaussian method the following equation is used:

$$\xi = \int_{-\infty}^{y_\xi} \frac{1}{\sigma\sqrt{2\pi}}\, e^{-\frac{1}{2}\left(\frac{x - \bar{X}}{\sigma}\right)^2}\,dx$$

where $y_\xi$ is the $(100)(\xi)$th percentile, $\sigma$ is the
standard deviation, and $\bar{X}$ is the mean.
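
The Gaussian estimate can be sketched with the Python standard library as shown below. The function name `gaussian_percentile` is hypothetical. The Appendix A listing appears to use a fixed table of standard normal deviates instead of an inverse distribution function, so the second decimal place may differ slightly from the program's output.

```python
from statistics import NormalDist


def gaussian_percentile(mean, std_dev, frac):
    """Return y such that P[X <= y] = frac for a normal population."""
    return NormalDist(mu=mean, sigma=std_dev).inv_cdf(frac)


# Mean and standard deviation reported for the strength data in Section 3.
for frac in (0.01, 0.05, 0.50, 0.95, 0.99):
    print(f"{100 * frac:5.1f}  {gaussian_percentile(53.33, 22.11, frac):8.2f}")
```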
SECTION 3
THE  STUDY 


The design of this study is basically experimental rather than
theoretical. The results reported here are obtained by randomly
selecting samples of different sizes from skewed data (1967
USAF Survey age data).

In  the  1967  Survey  of  USAF  Flying  Personnel  conducted  by  the 
Air  Force  Aerospace  Medical  Research  Laboratory  (see  Churchill, 
et  al.,  1977),  185  variables  were  measured  and  recorded  for  2420 
male  pilots.  For  this  study  three  subsets  of  sizes  200,  150,  and 
100  of  the  age  data  are  considered.  For  each  subset  ten  randomly 
selected  samples  are  drawn  without  replacement  from  the  population 
of  2420  observations. 
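
The sampling scheme can be sketched as follows. This is only an outline of the design described above: the survey data are not reproduced here, so a synthetic right-skewed population stands in for the 2420 age values, and the names `draw_samples` and `average_estimate` are hypothetical.

```python
import random
import statistics


def draw_samples(population, size, count=10, seed=None):
    """Draw `count` samples, each of `size` items without replacement."""
    rng = random.Random(seed)
    return [rng.sample(population, size) for _ in range(count)]


def average_estimate(samples, estimator):
    """Average one estimator (for example a percentile function) over samples."""
    return sum(estimator(s) for s in samples) / len(samples)


# Hypothetical stand-in for the 2420 age values (synthetic, not survey data).
rng = random.Random(1967)
population = [round(22.0 + rng.expovariate(1.0 / 8.0), 1) for _ in range(2420)]

for size in (200, 150, 100):
    samples = draw_samples(population, size, seed=size)
    print(size, round(average_estimate(samples, statistics.median), 2))
```

Any percentile estimator that takes a list of values can be passed in place of `statistics.median`.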

The  5th,  15th,  25th,  ...,  50th,  65th,  ...,  and  95th  percentile 
estimates  are  computed  using  the  Gaussian  method  and  the  nonparametric 
method.  The  average  nonparametric  percentile  estimates  and  Gaussian 
estimates  are  computed  for  each  of  the  three  subsets  considered  in 
this  study.  The  average  computed  percentiles  from  both  methods  are 
compared to the corresponding population percentiles. The population
percentiles are computed using the well-known counting method.

The  criteria  used  for  comparing  the  Gaussian  and  nonparametric 
methods  are  as  follows.  The  estimates  of  the  percentiles  should  be 
close  to  the  corresponding  percentiles  of  the  population  from  which 
the data sample is drawn. That is, the estimate of the 1st percentile
should be close to the population 1st percentile, the estimate of
the 2nd percentile should be close to the population 2nd percentile,
etc.

The total population arithmetic mean is 30.03 years, the
standard deviation is 6.31 years, and the measure of skewness, using
the third moment about the mean, is 0.76. The actual percentiles
and the computed percentiles for the total population (2420
observations) using both the Gaussian and nonparametric methods are
shown in Table 1. Also shown in Table 1 is the difference between each
population percentile and each corresponding percentile estimate
expressed as a percent of the actual percentile (Δ%). Table 2
shows the population percentiles for all 2420 observations, the
average nonparametric percentile estimates, and the average Gaussian
estimates from the ten randomly selected samples of size 200. The
population percentiles, the average nonparametric estimates, and
the average Gaussian estimates from the ten randomly selected samples
of sizes 150 and 100 are shown in Tables 3 and 4 respectively. Also
shown in Tables 2, 3, and 4 is the difference between every
population percentile and the corresponding percentile estimates
expressed as a percent of the actual percentile (Δ%).

Now let us compare the performance of the nonparametric method
described in Section 2 of this report with that of the Gaussian
method. As shown in Table 1, using all 2420 observations, the
nonparametric method outperforms the Gaussian method when estimating
the 5th, 25th, 35th, 45th, 50th, 55th, and 95th percentiles. It is
observed from Tables 2, 3, and 4 that the nonparametric method
outperforms the Gaussian method when estimating the 5th, 15th, 25th,
35th, 45th, 50th, 65th, and 95th percentiles for sample sizes 200,
150, and 100 respectively. It is also observed that the nonparametric
method is superior to the Gaussian method at the lower half of the
distribution since the data are skewed right (positive skewness).

In order to compare the performance of the nonparametric method
with that of the Gaussian method when dealing with different types
of data, the AFAMRL unpublished strength data (weight holding
in seconds) are considered. The 1st, 2.5th, 5th, 10th, ..., 95th,
97.5th, and 99th percentiles are computed using the counting
procedure, the Gaussian method, and the nonparametric method. The
total population size is 1,066 observations, the arithmetic mean is
53.33 seconds, the standard deviation is 22.11 seconds, and the
measure of skewness, using the third moment about the mean, is 0.95.
Table 5 shows the population percentiles, Gaussian estimates, and
nonparametric estimates for the total population (1,066 observations).
TABLE 1

POPULATION PERCENTILES, GAUSSIAN ESTIMATES,
AND NONPARAMETRIC ESTIMATES FOR THE TOTAL POPULATION
(n=2420) FOR THE AGE DATA


TABLE 2

POPULATION PERCENTILES, AVERAGE GAUSSIAN ESTIMATES,
AND AVERAGE NONPARAMETRIC ESTIMATES FOR TEN SAMPLES
OF SIZE 200 FOR THE AGE DATA


TABLE 3

POPULATION PERCENTILES, AVERAGE GAUSSIAN ESTIMATES,
AND AVERAGE NONPARAMETRIC ESTIMATES FOR TEN SAMPLES
OF SIZE 150 FOR THE AGE DATA


TABLE 4

POPULATION PERCENTILES, AVERAGE GAUSSIAN ESTIMATES,
AND AVERAGE NONPARAMETRIC ESTIMATES FOR TEN SAMPLES
OF SIZE 100 FOR THE AGE DATA

TABLE 5


POPULATION PERCENTILES, GAUSSIAN ESTIMATES,
AND NONPARAMETRIC ESTIMATES FOR THE STRENGTH
DATA (WEIGHT HOLDING IN SECONDS)


Percentile    Population     Gaussian    Nonparametric
              Percentile    Estimates        Estimates

       1.0         10.00         1.90             6.24
       2.5         15.00         9.99            12.43
       5.0         20.00        16.96            18.09
      10.0         27.00        24.98            25.40
      15.0         32.00        30.42            30.42
      20.0         35.00        34.71            34.40
      25.0         38.00        38.43            37.83
      30.0         42.00        41.74            41.01
      35.0         45.00        44.82            43.96
      40.0         47.00        47.74            46.67
      45.0         50.00        50.55            49.27
      50.0         52.00        53.33            51.82
      55.0         54.00        56.12            54.41
      60.0         56.00        58.93            57.06
      65.0         59.00        61.85            59.87
      70.0         62.00        64.92            62.93
      75.0         65.00        68.24            66.43
      80.0         69.00        71.95            70.55
      85.0         74.00        76.24            75.77
      90.0         81.00        81.68            82.68
      95.0         90.00        89.71            93.00
      97.5        101.00        96.67           103.50
      99.0        113.00       104.77           117.52

As with the age data, the nonparametric method is superior to the
Gaussian method, especially at the lower end of the distribution.
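
The comparison at the lower end of the distribution can be checked with a short calculation of the percent difference, sketched below using the three lowest rows of Table 5. The function name `percent_difference` and the sign convention (estimate minus actual) are assumptions; the report states only that the difference is expressed as a percent of the actual percentile.

```python
def percent_difference(estimate, actual):
    """Difference expressed as a percent of the actual percentile."""
    return 100.0 * (estimate - actual) / actual


# (percentile, population, Gaussian, nonparametric) taken from Table 5.
rows = [
    (1.0, 10.00, 1.90, 6.24),
    (2.5, 15.00, 9.99, 12.43),
    (5.0, 20.00, 16.96, 18.09),
]

print("Pctl   Gaussian   Nonparametric")
for pct, pop, gauss, nonpar in rows:
    print(f"{pct:4.1f}  {percent_difference(gauss, pop):9.1f}"
          f"  {percent_difference(nonpar, pop):14.1f}")
```

In each of these rows the nonparametric difference is smaller in magnitude than the Gaussian one, which is the sense in which the method is described as superior.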

In summary, based on the comparison shown in this report,
the nonparametric method is superior to the Gaussian method at
the lower half of the distribution since the data are skewed
right (positive skewness). The criterion used for comparing the
two methods is that the estimates of the percentiles should be
close to the corresponding percentiles of the population from
which the data sample is drawn.

During this study different sample sizes of the age data
and other anthropometric dimensions were considered and the
results were examined. For small samples (n ≤ 100), neither of
the two methods was superior to the other. But for samples greater
than 100 the nonparametric method is superior to the Gaussian
method for skewed data. The degree of performance of the
nonparametric method was proportional to the amount of skewness.

Finally,  when  there  is  substantial  reason  to  believe  that 
the  sample  was  drawn  from  a  skewed  population  (that  is,  where 
the  third  moment  about  the  mean  is  >0.6),  the  nonparametric 
method  provides  a  better  estimate  of  population  percentiles. 
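
A sketch of how this rule of thumb might be applied is shown below. The report does not give the skewness formula, so the standardized third central moment (third moment about the mean divided by the 3/2 power of the second) is assumed. The names `skewness` and `prefer_nonparametric` and the example values are hypothetical.

```python
def skewness(values):
    """Standardized third moment about the mean (assumed definition)."""
    n = len(values)
    mean = sum(values) / n
    m2 = sum((v - mean) ** 2 for v in values) / n
    m3 = sum((v - mean) ** 3 for v in values) / n
    return m3 / m2 ** 1.5


def prefer_nonparametric(values, threshold=0.6):
    """Apply the report's guideline: skewness above 0.6 favors the method."""
    return skewness(values) > threshold


symmetric = [1, 2, 3, 4, 5]
right_skewed = [1, 1, 2, 2, 3, 10]
print(skewness(symmetric), prefer_nonparametric(symmetric))
print(round(skewness(right_skewed), 2), prefer_nonparametric(right_skewed))
```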

More effort is needed to examine the possibilities of using the
method for nonskewed data (e.g. normally distributed data) and
negatively skewed data.


SECTION  4 

USING  PROGRAM  PRCNTLS 

Program  PRCNTLS  is  written  in  CDC  EXTENDED  FORTRAN  IV  and  can 
be  run  on  most  large  mainframe  machines  with  minimal  modifications. 
On  a  CDC  175,  47K  octal  words  of  memory  were  required  for  execution. 
The program is designed to compute the nonparametric percentile
estimates, Gaussian percentile estimates, and the true population
percentiles (optional). The nonparametric percentile estimates
are computed using the method described in Section 2 of this report.
The Gaussian estimates are computed using the following:

$$\xi = \int_{-\infty}^{y_\xi} \frac{1}{\sigma\sqrt{2\pi}}\, e^{-\frac{1}{2}\left(\frac{x - \bar{x}}{\sigma}\right)^2}\,dx$$

where $y_\xi$ is the $(100)(\xi)$th percentile, $\sigma$ is the
standard deviation, and $\bar{x}$ is the mean.

The population percentiles are computed using the counting
procedure. The data are arranged in order of magnitude, and then
are grouped into convenient class intervals. Then, the number of
observations below each upper class limit is counted, divided by
the total number of observations, and multiplied by 100 to determine
the percentile rank.
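
A minimal sketch of a counting estimate is given below. It follows the rank rule that the Appendix A listing appears to use (sort the data and take the observation whose rank is the sample size times the fraction, rounded to the nearest integer) and not the class-interval description above. The name `counting_percentile` is hypothetical.

```python
def counting_percentile(values, frac):
    """Return the ordered observation at rank int(frac * n + 0.5).

    Returns None when the rank rounds to zero, i.e. the sample is too
    small for the requested percentile.
    """
    ordered = sorted(values)
    rank = int(frac * len(ordered) + 0.5)  # 1-based rank
    if rank < 1:
        return None
    return ordered[rank - 1]


scores = list(range(1, 101))  # hypothetical data: the integers 1..100
for frac in (0.05, 0.50, 0.95):
    print(frac, counting_percentile(scores, frac))
```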

4.1  THE  PROGRAM  OUTPUT 

The output of program PRCNTLS is written to Unit 6 and contains
the following (see Figure 1):

(1)  the  variable  name, 

(2)  the  survey  name, 

(3)  the  arithmetic  mean  for  that  variable, 

(4)  the  standard  deviation, 

(5)  the  sample  size, 

(6)  the  Gaussian  percentile  estimates, 

(7)  the  nonparametric  percentile  estimates,  and 

(8) optionally, the actual population percentiles using the
counting method.

Figure 1. Program PRCNTLS Sample Output.

Population percentiles by the counting method are included
to show the user of the program how well the two percentile
estimation techniques fared on the user's data.

4.2  PROGRAM  INPUT

The  input  to  program  PRCNTLS  is  read  from  Unit  5  and  consists 
of  the  following: 

•  the  variable  name, 

•  the  survey  name, 

•  the  sample  size, 

•  the  counting  method  indicator  (1  if  the  percentiles  by  the 
counting  method  are  desired;  0  if  not) , 

•  the  data  format,  and 

•  the  data  itself. 

As many sets of input as desired may be run together, ending with
either a blank card or an end-of-file (EOF). The general data deck
layout is shown in Figure 2. The data format is as follows:

•  The variable name and survey name,

columns 1-30   the variable name (3A10)
columns 41-70  the survey name (3A10)

•  The sample size and counting method indicator,

columns 1-5    the sample size (I5)
column 7       the counting method indicator (I2)

•  The data format,

columns 1-80   the data format enclosed in parentheses (8A10)

•  The data as specified in the data format.

Figure  3  is  the  input  example  that  produced  the  output  of  Figure  1. 
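
For illustration, an input set with this column layout could be assembled as sketched below. The helper name `build_deck`, the choice of `(10F5.1)` as the data format, and the example values are hypothetical and do not reproduce Figure 3. Values of 1000 or more would overflow the assumed five-column field.

```python
def build_deck(variable, survey, values, want_counting=True, per_card=10):
    """Return the card images for one input set as a list of strings."""
    fmt = f"({per_card}F5.1)"  # assumed data format, one card per row
    cards = [
        # Columns 1-30 variable name, 31-40 blank, 41-70 survey name.
        f"{variable:<30.30}{'':10}{survey:<30.30}",
        # Columns 1-5 sample size (I5), columns 6-7 indicator (I2).
        f"{len(values):5d}{int(want_counting):2d}",
        # Columns 1-80 data format in parentheses (8A10).
        f"{fmt:<80}",
    ]
    for start in range(0, len(values), per_card):
        chunk = values[start:start + per_card]
        cards.append("".join(f"{v:5.1f}" for v in chunk))
    return cards


ages = [float(v) for v in range(22, 37)]  # 15 hypothetical values
for card in build_deck("AGE", "1967 USAF SURVEY", ages):
    print(card.rstrip())
```

A blank card or an end-of-file after the last set ends the run, as described above.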


Figure 3. Program PRCNTLS Sample Input.



APPENDIX  A 

THE  PROGRAM  LISTING 


      PROGRAM PRCNTLS
     + (INPUT,OUTPUT,TAPE5=INPUT,TAPE6=OUTPUT)
C
C     THIS PROGRAM COMPUTES THE NONPARAMETRIC PERCENTILE
C     ESTIMATES, THE GAUSSIAN ESTIMATES, AND THE POPULATION
C     PERCENTILES USING THE COUNTING METHOD (OPTIONAL).
C
C     INPUT
C     1. VARIABLE NAME, SURVEY NAME (3A10,10X,3A10)
C        UP TO 30 CHARACTERS EACH
C     2. SAMPLE SIZE, COUNT METHOD INDICATOR (I5,I2)
C        1 FOR THE COUNT METHOD INDICATOR IF THE COUNT METHOD
C        PERCENTILES ARE DESIRED; BLANK OR ZERO IF NOT.
C     3. DATA FORMAT (8A10)
C     4. DATA (AS PER FORMAT)
C     5. REPETITIONS OF NUMBERS 1-4 AS DESIRED
C
      DIMENSION D(23),GAMMA(23),P(23),ITER(23),PCNT(23),XX(2420)
      DIMENSION SURVEY(3),VRNAME(3)
      REAL NWMEAN(23),ESMEAN(23)
C
C     D HOLDS THE STANDARD NORMAL DEVIATES FOR THE PERCENTILES IN P
      DATA D/-2.326,-1.96,-1.645,-1.282,-1.036,-.842,-.674,
     * -.524,-.385,-.253,-.126,
     * 0.0,.126,.253,.385,.524,.674,.842,1.036,1.282,
     * 1.645,1.96,2.326/
      DATA P/ 1., 2.5, 5.,10.,15.,20.,25.,30.,35.,40.,45.,
     * 50.,55.,60.,65.,70.,75.,80.,85.,90.,95.,97.5,99./
      DATA PCNT/23*1.E20/,BLANK/10H          /
      DATA NP/23/
   10 CONTINUE
C     READ VARIABLE AND SURVEY NAMES
      READ(5,300) VRNAME,SURVEY
      IF(VRNAME(1).EQ.BLANK.OR.EOF(5).GT.0) STOP
C     READ SAMPLE SIZE
      READ(5,301) NS,ICNT
C     READ IN SAMPLE
      CALL RDDAT(XX,XBAR,XSD,NS)
C     COUNT METHOD PERCENTILES
      IF(ICNT.GT.0) CALL CNTPRCN(XX,NS,PCNT)
C     CALCULATE GAUSSIAN ESTIMATES
      DO 40 I=1,NP
      GAMMA(I)=P(I)/100.
      ESMEAN(I)=XBAR+XSD*D(I)
C     CALCULATE NON-PARAMETRIC ESTIMATES
      CALL NONPAR(ESMEAN(I),GAMMA(I),XX,NWMEAN(I),NS,ITER(I),XSD)
   40 CONTINUE
C     WRITE RESULTS
      WRITE(6,202) VRNAME,SURVEY
      WRITE(6,200) XBAR,XSD,NS
      IF(ICNT.GT.0) GO TO 50
      WRITE(6,204)
      WRITE(6,401) (P(J),ESMEAN(J),NWMEAN(J),J=1,NP)
      GO TO 10
   50 WRITE(6,206)
      WRITE(6,402) (P(J),ESMEAN(J),NWMEAN(J),PCNT(J),J=1,NP)
      GO TO 10
  200 FORMAT(//52X,*MEAN. . . . . .*,F8.2,
     * /52X,*STANDARD DEV. .*,F8.2,
     * /52X,*SAMPLE SIZE . .*,I6//)
  202 FORMAT(1H1,5(/),45X,36HESTIMATED PERCENTILES OF SKEWED DATA,
     + ///,21X,3A10,30X,3A10)
  204 FORMAT(57X,8HGAUSSIAN,9X,14HNON-PARAMETRIC,/,37X,10HPERCENTILE,
     * 10X,8HESTIMATE,12X,8HESTIMATE,/)
  206 FORMAT(47X,8HGAUSSIAN,9X,14HNON-PARAMETRIC,9X,8HCOUNTING,/,27X,
     * 10HPERCENTILE,10X,8HESTIMATE,12X,8HESTIMATE,13X,6HMETHOD,/)
  300 FORMAT(3A10,10X,3A10)
  301 FORMAT(I5,I2)
  401 FORMAT(39X,F5.1,12X,F8.2,12X,F8.2)
  402 FORMAT(29X,F5.1,12X,F8.2,12X,F8.2,12X,F8.2)
      END


      SUBROUTINE RDDAT(XX,XBAR,XSD,NS)
      DIMENSION XX(2420),FMT(8)
C
      XBAR=0.
      XSD=0.
C     READ INPUT FORMAT
      READ(5,100) FMT
C     READ SAMPLE
      READ(5,FMT) (XX(I),I=1,NS)
C     CALCULATE MEAN AND STD DEV
      DO 20 I=1,NS
      XBAR=XBAR+XX(I)
      XSD=XSD+XX(I)*XX(I)
   20 CONTINUE
C
      XBAR=XBAR/NS
      XSD=XSD/NS
      XSD=XSD-XBAR**2
      XSD=SQRT(XSD)
C
  100 FORMAT(8A10)
      RETURN
      END


      SUBROUTINE NONPAR(START,ALPHA,X,END,N,INDEX,SD)
      DIMENSION X(2420)
C     SET UP INITIAL CONDITIONS
      INDEX=0
      TOP=START
      BOTTOM=START
      END=START
      XN=N
C     BANDWIDTH H = S*N**(-1/5), EQUATION (3)
      H=SD/XN**.2
      VALUE=XN*(2*ALPHA-1)
      DIFF=.00001*XN
      T=SD/10
C     CALCULATE PROB OF .LE. END
    5 CONTINUE
      INDEX=INDEX+1
      SUM=0.
      DO 10 I=1,N
      XX=(END-X(I))/H
      IF(XX.LT.0.) GO TO 7
      SUM=SUM+1.-EXP(-XX)
      GO TO 10
    7 SUM=SUM-1.+EXP(XX)
   10 CONTINUE
C     HOW CLOSE ARE WE ?
      DIST=VALUE-SUM
      IF(INDEX.GT.50.OR.ABS(DIST).LE.DIFF) RETURN
C
      IF(DIST.LT.0.) GO TO 20
      IF(END.NE.TOP) GO TO 15
C     SHIFT INTERVAL RIGHT
      BOTTOM=TOP
      TOP=TOP+T
      END=TOP
      GO TO 5
C     TAKE RIGHT HALF OF INTERVAL
   15 BOTTOM=END
      END=(BOTTOM+TOP)/2.
      GO TO 5
C
   20 CONTINUE
      IF(BOTTOM.NE.END) GO TO 25
C     SHIFT INTERVAL LEFT
      TOP=BOTTOM
      BOTTOM=BOTTOM-T
      END=BOTTOM
      GO TO 5
C     TAKE LEFT HALF OF INTERVAL
   25 TOP=END
      END=(BOTTOM+TOP)/2.
      GO TO 5
C
      END


      SUBROUTINE SORT(X,N)
      DIMENSION X(1)
C     THIS IS A SIMPLE SORT
      DO 100 I=2,N
      IM=I-1
      XX=X(I)
      DO 50 J=1,IM
      IF(X(J).LT.XX) GO TO 50
      CALL SHIFT(X,J,I)
      GO TO 100
   50 CONTINUE
  100 CONTINUE
      RETURN
      END


      SUBROUTINE SHIFT(X,J,I)
      DIMENSION X(1)
C     MOVE X(I) DOWN TO POSITION J, SHIFTING X(J)..X(I-1) UP BY ONE
      INT=I-J
      XX=X(I)
      DO 10 K=1,INT
      X(I-K+1)=X(I-K)
   10 CONTINUE
      X(J)=XX
C
      RETURN
      END


      SUBROUTINE CNTPRCN(XX,NS,PCNT)
C
      DIMENSION XX(2420),GAMMA(23),P(23),PCNT(23)
      DATA P/ 1.,2.5, 5.,10.,15.,20.,25.,30.,35.,40.,45.,
     * 50.,55.,60.,65.,70.,75.,80.,85.,90.,95.,97.5,99./
C
      N=23
      CALL SORT(XX,NS)
      DO 100 I=1,N
      GAMMA(I)=P(I)/100.
C     RANK OF THE PERCENTILE, ROUNDED TO THE NEAREST INTEGER
      M=GAMMA(I)*NS+.5
      IF(M.GT.0) PCNT(I)=XX(M)
  100 CONTINUE
      RETURN
      END